Xing Yue

Phun-Bench: Evaluating LLMs on Phonological Understanding in Chinese

Jun 05, 2026
Via arXiv

Beyond Rubrics: Exploration-Guided Evaluation Skills for Reward Modeling

Jun 05, 2026
Via arXiv

CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability

Feb 03, 2026
Via arXiv